The role of the Methodology Advisory Committee (MAC) is to review and direct research
into the collection, estimation, dissemination and analytical methodologies associated
with ABS statistics. Papers presented to the MAC are often in the early stages of
development, and therefore do not represent the considered views of the Australian
Bureau of Statistics or the members of the Committee. Readers interested in the
subsequent development of a research topic are encouraged to contact either the author
or the Australian Bureau of Statistics.


SYNTHESISING ESTIMATES OF INDIGENOUS CHILD HEALTH 
BASED ON THE W.A. ABORIGINAL CHILD HEALTH SURVEY 


Terry Rawnsley, Sarah Dexter and Katie Palin 
Analytical Services Branch 


ABSTRACT 


The Western Australian Aboriginal Child Health Survey (WAACHS) was the first 
large-scale epidemiological survey of Indigenous children in Australia. It provides 
detailed information about the health, mental health, education and other 
socioeconomic outcomes for Indigenous children in Western Australia. 


Given that Queensland and the Northern Territory also have a substantial Indigenous 
population, these jurisdictions would find information similar to the WAACHS most 
useful in policy development and service provision. This paper examines the 
feasibility of using the WAACHS and other nationally available datasets to model key 
indicator variables for Queensland and the Northern Territory. 


The methods and results described in this paper represent development work as at 
June 2005. Comments from the Methodology Advisory Committee (MAC) prompted 
additional analyses to investigate the validity of various methodological assumptions. 
A summary of this work is provided as an appendix to this paper, and is the subject of 
a report that will be released in early 2007. 


The models described in this paper appeared to produce promising results. However, 
additional analysis found that there is insufficient evidence to prove or disprove key 
assumptions for the synthetic estimation method. This means it is not feasible to 
derive synthetic estimates for Queensland and the Northern Territory based on the 
WAACHS. 
1. INTRODUCTION


1.1 Background 


The Methodology Advisory Committee (MAC) was presented with a draft version of this
paper to review a synthetic method of estimating Indigenous child health and
wellbeing for regions in Queensland and the Northern Territory based on the Western 
Australian Aboriginal Child Health Survey (WAACHS). The methods and results 
described in this paper represent development work as at June 2005. The MAC raised 
concerns about the validity of key methodological assumptions underlying the 
estimation method. 


Further work was undertaken after the MAC meeting to analyse the validity of these 
assumptions. This additional work is the subject of a report that will be released in 
early 2007, and a summary of this can be found in Appendix D. We found that there is 
insufficient evidence to prove or disprove the assumptions underlying the synthetic 
estimation method. 


1.2 Introduction 


The Indigenous population in Australia has health outcomes far below those of the
rest of the population (Australian Bureau of Statistics, 2001). Many of the health 
conditions suffered by Indigenous people can be linked to factors which appear at a 
very early age or even before birth (Zubrick et al., 2004b). This project explored the 
feasibility of synthesising estimates of Indigenous child health and wellbeing for 
regions in Queensland and the Northern Territory based on WAACHS. 


WAACHS was conducted by the Telethon Institute for Child Health Research (TICHR) 
from May 2000 to June 2002 and was the first large-scale epidemiological survey of 
Indigenous children and young people in Australia. The primary objective of the 
WAACHS was to identify the developmental and environmental factors affecting the 
health of Indigenous children and young people. 


Extrapolated synthetic estimates will generally not be as accurate as estimates from a 
sample survey. However, they could provide a broad indication of the distribution of 
key variables for areas (in this study each ATSIC region in Queensland and the
Northern Territory¹). Extrapolating estimates to Queensland and the Northern
Territory is an extreme case of an out-of-sample estimation problem.


1 With the abolition of ATSIC Regional Councils and the establishment by the Office of Indigenous Policy 
Coordination of regional Indigenous Coordination Centres (ICCs), changes have been made to the geographic 
regions used for producing statistics in relation to Aboriginal peoples. While it is recognised that ATSIC regions 
no longer exist, we have kept these regions to provide continuity with other WAACHS products. 
Section 1 introduces the project and the input sought from the MAC. Section 2
presents the nature of the problem while Section 3 describes the methods
investigated along with some preliminary results. Section 4 concludes the paper.


1.3 Project overview 


This feasibility study began in early 2005. At June 2005, the project team had: 


• Become familiar with the WAACHS and the data sources which are seen as
potentially having the most use in the project. Appendix B provides a brief
overview of the data sources that could be used for estimation or validation of
results.


• Identified appropriate estimation methods.


• Modelled a limited number of variables to understand the methods and the
problems which will have to be addressed.


• Started testing some of the assumptions underpinning the models.


1.4 Input sought from the Methodology Advisory Committee 


The project team sought input from MAC on the following points: 


• Do the proposed methods appear to be reasonable for the problem at hand?


• Are there ways to validate the models specified for Western Australia,
Queensland and the Northern Territory?


• How can we make best use of the auxiliary data we have available?


• Any other insights MAC may have.
2. THE NATURE OF THE PROBLEM


Extrapolating estimates to Queensland and the Northern Territory represents an
extreme case of an out-of-sample estimation problem. This technical problem is a
particular form of small area estimation, where survey data is modelled to produce 
results at a fine level of disaggregation. However, there is no measured response 
variable (Y variable) for Queensland and the Northern Territory. 


Figure 2.1 is used to illustrate the problem (in its most simple linear form). In 
Western Australia we were able (via the WAACHS) to measure both the response and 
explanatory variables and specify a model. However, only the explanatory variables 
were available in Queensland and the Northern Territory. So, the parameters from 
Western Australia could be applied to the explanatory variables in Queensland and the 
Northern Territory in order to make predictions for those jurisdictions. 


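2.1 Nature of the problem


$$y_{WA,i} = \beta_{1,WA} + \beta_{2,WA} x_{1,WA,i} + \beta_{3,WA} x_{2,WA,i} + e_{WA,i}$$

$$y_{QLD,i} = \beta_{1,WA} + \beta_{2,WA} x_{1,QLD,i} + \beta_{3,WA} x_{2,QLD,i}$$

$$y_{NT,i} = \beta_{1,WA} + \beta_{2,WA} x_{1,NT,i} + \beta_{3,WA} x_{2,NT,i}$$


where $i$ is each child, $\beta$ are the model coefficients and $x$ are the explanatory variables.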
The estimates for Queensland and the Northern Territory were based on the 
relationship between the response and explanatory variables observed in Western 
Australia. 
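The extrapolation step in Figure 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the names `fit_ols`, `predict`, `x_wa` and `x_qld` are hypothetical, and ordinary least squares stands in for the simple linear form of the figure rather than for the models actually fitted in this paper.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]


def fit_ols(xs, ys):
    """Least-squares coefficients for y = b1 + b2*x1 + b3*x2 + e."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * y for r, y in zip(rows, ys)) for j in range(p)]
    return solve(xtx, xty)


def predict(beta, xs):
    """Apply fitted coefficients to explanatory variables from another jurisdiction."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in xs]


# Simulated (hypothetical) data: both y and x are observed in the donor jurisdiction.
rng = random.Random(0)
true_beta = [1.0, 0.5, -0.3]
x_wa = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
y_wa = [true_beta[0] + true_beta[1] * x1 + true_beta[2] * x2 + rng.gauss(0, 0.1)
        for x1, x2 in x_wa]
beta_wa = fit_ols(x_wa, y_wa)

# Only x is observed in the target jurisdiction; y is predicted with the donor coefficients.
x_qld = [(rng.gauss(0.2, 1), rng.gauss(-0.1, 1)) for _ in range(100)]
y_qld_hat = predict(beta_wa, x_qld)
```

The predictions are only as good as the assumption that the donor coefficients carry over to the target jurisdiction, which is exactly the assumption examined in the remainder of this section.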


Key assumptions are imposed when applying the Western Australian models to 
Queensland and the Northern Territory. The fundamental assumption is that the 
relationships identified for Western Australia are similar to those in Queensland and 
the Northern Territory. That is, state or territory has no impact on health outcomes 
after taking into account an individual’s social and economic circumstances. 


Our initial descriptive analysis (not presented in this paper) showed that the 
distribution and patterns of related health variables were similar across the three 
jurisdictions. For this method to be feasible, we assumed that this similarity would 
also hold for the response variables which we tried to model. However, we could not 
conclude from this descriptive analysis that the underlying factors associated with 
these variables were the same for each jurisdiction. 
The models from the WAACHS were applied to the ABS 2001 National Health Survey
Indigenous component (NHS(I)). This collection was chosen because it includes
relevant health information about carers and children which could be useful in
modelling. A number of models were specified, some including these health variables
and others including only ‘demographic’ variables. The sampling error in the
explanatory variables has not been taken into account.


There is also the issue of consistency between the variables collected in the different
data sets. That is, the concept underlying each variable needs to be the same in the
different data sets. It is ABS practice to collect survey data using standard definitions (and
questions) and the WAACHS development drew on ABS standards. 


For the feasibility study, three variables were modelled to test how viable the 
estimation for Queensland and the Northern Territory is using the WAACHS. The 
variables chosen were: 


• Low birth-weight (less than 2,500 grams).


• Self-harm (deliberately harmed self, talked about death or suicide, attempted
suicide).


• Tropical ear (ever suffered from runny/glue ears).


These three variables were chosen as they represent a diverse set of variables which
should help identify issues associated with creating synthetic estimates of Indigenous
child health for Queensland and the Northern Territory.
3. METHODOLOGY


3.1 Area level models 


The Poisson distribution is regarded as the “benchmark model for count data” 
(Cameron & Trivedi, 1998). The Poisson distribution seemed appropriate in this case 
given we modelled counts of children with certain health conditions. That is, the
response variable ($y_i$) is a discrete random variable with a Poisson distribution with
parameter $\mu_i > 0$, such that:


$$P(y_i \mid X_i) = \frac{e^{-\mu_i} \mu_i^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots$$

$$\log \mu_i = X_i \beta$$


where
$X_i$ = the matrix of explanatory variables,
$\beta$ = the vector of regression coefficients.


The Poisson model assumes that the conditional variance is equal to the mean, $\mu_i$.
Unfortunately this assumption rarely holds; instead we must relax the variance
assumption to account for over-dispersion. The new assumption is that the
conditional variance is given by:


$$\mathrm{Var}(y_i \mid X_i) = \mu_i + g \mu_i^{r}$$


where $g$ is the dispersion parameter, which would be equal to zero under a true
Poisson distribution. However, $g > 0$ if the data are over-dispersed.
The value of $r$ describes the form of over-dispersion. Here we used $r = 1$, as this is
what SAS implements, and so the variance function takes the form $\mu_i + g\mu_i$. This, of
course, may not correctly adjust for the over-dispersion present in the data.
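To make the relaxed variance assumption concrete, the sketch below evaluates the variance function and backs out the dispersion parameter from fitted means. It is a minimal illustration, not the procedure used in this paper: the Pearson moment estimator is one common choice and is assumed here, and the function names and the small data set are hypothetical.

```python
def variance_function(mu, g, r=1):
    """Conditional variance mu + g * mu**r; g = 0 recovers the Poisson case."""
    return mu + g * mu ** r


def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Under Var(y) = (1 + g) * mu (the r = 1 form) this ratio estimates 1 + g.
    """
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)


def estimate_g(y, mu, n_params):
    """Moment estimate of g for r = 1, truncated at zero (no under-dispersion)."""
    return max(0.0, pearson_dispersion(y, mu, n_params) - 1.0)


# Hypothetical counts and fitted means for four areas, with one fitted parameter.
counts = [1, 12, 0, 20]
fitted = [4.0, 6.0, 3.0, 10.0]
g_hat = estimate_g(counts, fitted, n_params=1)
```

With only a handful of areas the residual degrees of freedom are very small, so an estimate of this kind is itself highly variable.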


While the Poisson regression appeared to be well suited to this case, the real test 
comes when the models are specified and the results examined. The number of 
children born with low birth-weight in each ATSIC region in Western Australia is the 
dependent variable in this example. The explanatory variables in the model are: 


• The number of children born to mothers who consumed alcohol during the
pregnancy in the ATSIC region. This was collected on the WAACHS.


• The number of children born to mothers who consumed cigarettes during the
pregnancy in the ATSIC region. This was collected on the WAACHS.


• The average ARIA++ score for the ATSIC region (see Appendix B for more
information on ARIA++). This was constructed from variables on the Census.


• The SEIFA Index of Relative Socio-economic Advantage/Disadvantage score for
the ATSIC region. This was constructed from variables on the Census.


The pseudo R-squared for this model was 64.7% (based on the decomposition of
deviance (Cameron and Windmeijer, 1997)). The results from the Poisson regression
were somewhat disappointing. In general the model estimates were not close to the 
direct estimates from the WAACHS. 
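For readers unfamiliar with a deviance-based pseudo R-squared, the sketch below shows the usual construction for a Poisson model: one minus the ratio of the fitted model's deviance to the deviance of an intercept-only model. The function names and the three counts are hypothetical, and this is an illustration of the general idea rather than the exact computation behind the figure quoted above.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum[y*log(y/mu) - (y - mu)], taking y*log(y) as 0 when y = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total


def deviance_r_squared(y, mu_hat):
    """1 - D(y, mu_hat) / D(y, y_bar): the share of the null deviance explained."""
    y_bar = sum(y) / len(y)
    null_deviance = poisson_deviance(y, [y_bar] * len(y))
    return 1.0 - poisson_deviance(y, mu_hat) / null_deviance


# Hypothetical counts: a perfect fit gives 1, an intercept-only fit gives 0.
observed = [2.0, 5.0, 9.0]
r2_perfect = deviance_r_squared(observed, observed)
r2_null = deviance_r_squared(observed, [sum(observed) / 3] * 3)
```

A high value on this measure describes in-sample fit only; it does not guarantee that the model estimates will sit close to the direct estimates.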


There are a number of possible reasons why the Poisson regression performed so 
poorly. There was only a very limited number of ATSIC regions in Western Australia
(nine) to base the model on. Also, the exact form of the over-dispersion in the data
may have differed from that assumed, and so may not have been fully accounted for in this model.


ATSIC regions were established for administrative purposes rather than as areas
consisting of homogeneous populations (perhaps a good example of the ecological
fallacy). In general, ATSIC regions consist of one large population centre² and a wider
area which is relatively sparsely populated. So the characteristics of the Indigenous 
children in the area are quite heterogeneous. Therefore, the very nature of the ATSIC 
regions could be the main reason why the area based Poisson model performed so 
poorly (relative to the person level models presented in the next section). 


The Poisson models for the other chosen variables (self-harm and tropical ear) also 
performed quite poorly. 


Overall, a number of different problems have been identified with the area based 
Poisson regression models. However, this is not the only method which could be 
suitable for this particular problem. The next section investigates using person level 
models. 


3.2 Person level models 


We also investigated modelling at the person level. While the modelling may be 
undertaken at the person level, there is no desire to use the data at this level. Rather, 
the results from the person level would be aggregated up to be reported at the ATSIC 
region or other broad breakdowns for each jurisdiction. Modelling at the person level 
allows for more differentiation between the characteristics of each child than area 
based models. 


The WAACHS is an example of a multistage clustered sample survey. A sample of 
CDs/communities (CDs in urban areas and communities in remote areas) were 
selected at the first stage. At the second stage Indigenous families were selected and 
then every Indigenous child under the age of 18 was included in the survey. 


2 The size of the service centre can vary dramatically, from a population centre of a few thousand people to a
major city of a million or more people.
Probability Weighted Iterative Generalised Least Squares (PWIGLS) can be used to fit a
multilevel model which accounts for the survey design. This method (proposed
by Pfeffermann et al., 1998) was applied to a two level model with normally 
distributed continuous variables. TICHR extended this method to include three levels 
(CD, family, child) and to enable the fitting of a logistic regression (Zubrick et al., 
2004a). 


The person level models here take a logistic form. The general form of the model is:


$$y_{ij} \sim \mathrm{Bernoulli}(p_{ij})$$

$$\mathrm{logit}(p_{ij}) = \log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = X_{ij}\beta$$


where
$y_{ij}$ = whether child $i$ in area $j$ was born with low birth-weight,
$X_{ij}$ = the matrix of explanatory variables,
$\beta$ = the vector of regression coefficients.


The explanatory variables were the child’s age, the age of the mother, sex of the
child, ARIA++ (see Appendix A for explanation) and a dummy variable indicating whether
more than 20% of the population in the ATSIC region is Indigenous. The models from the
WAACHS were then applied to the NHS(I) to produce estimates for Western Australia. The
variables being used are conceptually the same in both data sets.


The underlying WAACHS and NHS(I) data were examined to ensure that they are
empirically consistent. This examination revealed some differences between the
sampling strategies for NHS(I) and WAACHS.³ For example, in the Broome ATSIC
region, the NHS(I) sample was four times the WAACHS sample. Not surprisingly this
has an impact on the quality of some of the estimates.


The coefficients from the PWIGLS were fed into the Windows Bayesian Inference 
Using Gibbs Sampling (WINBUGS) (Spiegelhalter et al., 1997) as starting values for
each coefficient. WINBUGS was used to create the measures of prediction error from 
posterior distributions. This method was chosen due to the complexity of calculating 
Mean Squared Errors (MSEs) and time constraints for the feasibility study. 


This modelled probability was then applied to each child in the NHS(I). The number
of children with low birth-weight was estimated and then converted into rates.
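The step from person level probabilities to region level rates can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the paper: the record layout (`region`, `weight`, `x`), the use of survey weights in the aggregation, and the coefficient values are all hypothetical.

```python
import math
from collections import defaultdict


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def region_rates(children, beta):
    """Weighted mean of modelled probabilities within each region.

    children: iterable of dicts with keys 'region', 'weight' and 'x', where 'x'
    is the covariate vector including a leading 1.0 for the intercept.
    """
    expected = defaultdict(float)   # weighted expected number of cases
    population = defaultdict(float) # weighted number of children
    for child in children:
        p = inv_logit(sum(b * v for b, v in zip(beta, child["x"])))
        expected[child["region"]] += child["weight"] * p
        population[child["region"]] += child["weight"]
    return {region: expected[region] / population[region] for region in expected}


# Hypothetical records and coefficients (intercept, age, remoteness indicator).
children = [
    {"region": "A", "weight": 120.0, "x": [1.0, 4.0, 0.0]},
    {"region": "A", "weight": 80.0, "x": [1.0, 11.0, 0.0]},
    {"region": "B", "weight": 60.0, "x": [1.0, 7.0, 1.0]},
]
hypothetical_beta = [-2.0, 0.05, 0.3]
rates = region_rates(children, hypothetical_beta)
```

Summing probabilities rather than classifying each child keeps the aggregate on the scale of an expected count, which is what a region level rate requires.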


The Iterative Generalised Least Squares (IGLS) approach does not maximise a
likelihood function and this creates difficulties in producing a pseudo R-squared. With
the multilevel model (in this case three levels: CD, carers, children) we found that
two thirds of total variance is explained by the child and the carer level. Therefore,
does it then make sense to construct an R-squared measure that says x% of the total
variance situated at the child and carer level is explained by the model?


3 The technical issues relating to the auxiliary data used in the modelling are discussed further in Appendix A.


Table 3.1 compares the direct estimates from the WAACHS with modelled estimates
from the NHS(I) for each ATSIC region in Western Australia. Modelled estimates were
obtained using a person level model. The prediction error from posterior
distributions provides an indication of the reliability of the estimates for each ATSIC
region. However, MSEs are still seen as the better measure of the quality of the
estimates.


3.1 Direct estimates from WAACHS and person model estimates from NHS(I) for low birth-weight


                             WAACHS                      Estimate
                             direct           Posterior     based Posterior
ATSIC Region     Lower CI  estimate  Upper CI  Lower CI on NHS(I)  Upper CI
Broome               6.4%     10.7%     17.1%     10.8%     14.2%     18.2%
Derby               12.1%     17.8%     24.6%     10.1%     13.4%     17.1%
Geraldton            8.4%     12.3%     17.6%      9.8%     10.9%     12.1%
Kalgoorlie           3.2%      5.7%      9.2%      7.3%     10.4%     14.3%
Kununurra            7.0%     10.1%     13.6%     10.2%     13.8%     18.2%
Narrogin             8.7%     11.7%     15.6%      9.6%     10.8%     12.1%
Perth                8.5%     10.7%     13.3%      9.4%     10.8%     12.2%
South Hedland        7.6%     12.0%     17.2%      9.1%     11.0%     13.1%
Warburton            4.5%     13.0%     28.8%      9.7%     12.6%     15.9%


It is also possible for a model with a high goodness of fit to give predictions with high 
prediction error and vice versa. Researchers tend to focus on the prediction error 
because that is what is most important in this situation. 


Synthetic models may produce quite poor results for areas that have unique 
characteristics (that are not accounted for in the model) causing them to be at the 
extremes of the distribution. There is the option to exclude areas with extreme values 
from the model estimation. This may lead to improved estimates for the remaining 
areas, although the inclusion of areas with extreme values may improve the estimates 
for the other jurisdictions with outliers. Appendix C presents estimates of Relative 
Standard Errors and the Root Relative Posterior Variance. 


Overall the model appeared to be producing results which are quite close to the 
direct estimates from the WAACHS. The confidence intervals, based on the posterior
variance, also suggest that the estimates were relatively robust for low birth-weight.
However, the rankings for the ATSIC regions based on the model predictions were 
quite different to those based on the direct estimates. 
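One way to quantify how far two sets of rankings agree is a Spearman rank correlation. The sketch below applies it to the point estimates transcribed from Table 3.1; the choice of this statistic and the function names are assumptions made for illustration, not part of the original analysis.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


def spearman(a, b):
    """Pearson correlation of the (tie-averaged) ranks of a and b."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Low birth-weight point estimates (per cent) from Table 3.1, in table order:
# Broome, Derby, Geraldton, Kalgoorlie, Kununurra, Narrogin, Perth,
# South Hedland, Warburton.
direct = [10.7, 17.8, 12.3, 5.7, 10.1, 11.7, 10.7, 12.0, 13.0]
modelled = [14.2, 13.4, 10.9, 10.4, 13.8, 10.8, 10.8, 11.0, 12.6]
rho = spearman(direct, modelled)
```

A value near 1 would indicate that the model orders the regions as the direct estimates do; with only nine regions and wide confidence intervals, any such coefficient should be read cautiously.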
In Table 3.2, the results from the modelled estimates for self-harm are compared to the
direct estimates from the WAACHS for each ATSIC region. The explanatory variables
in the self-harm model were the child’s age, sex of the child, ARIA++, SEIFA
(advantage/disadvantage index in quartiles) and a dummy variable indicating whether
more than 20% of the population in the ATSIC region is Indigenous.


3.2 Direct estimates from WAACHS and person model estimates from NHS(I) for self-harm


                             WAACHS                      Estimate
                             direct           Posterior     based Posterior
ATSIC Region     Lower CI  estimate  Upper CI  Lower CI on NHS(I)  Upper CI
Broome              16.0%     26.4%     37.6%     12.8%     16.5%     20.7%
Derby                6.2%     15.3%     32.0%      5.5%      7.3%      9.4%
Geraldton            4.3%      8.6%     14.5%      5.9%      7.2%      8.7%
Kalgoorlie           3.7%      6.3%     10.5%      2.2%      3.1%      4.1%
Kununurra            3.6%      5.8%      8.6%      4.1%      6.2%      8.5%
Narrogin             6.4%      9.9%     14.4%      6.0%      7.2%      8.6%
Perth               10.3%     12.8%     15.9%     14.4%     16.6%     19.0%
South Hedland        4.4%      8.2%     14.2%      4.8%      6.5%      8.2%
Warburton            0.3%      3.1%      9.5%      6.2%      8.0%     10.0%


When compared to the direct estimates from the WAACHS, the estimates for Broome 
and Derby were poor (although there are large confidence intervals around the direct 
estimates). Once again these are the ATSIC regions which were at the extremes of the 
distribution. As with low birth-weight the rankings for the ATSIC regions were quite 
different for the model predictions compared to the direct estimates. 
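The size of the gap between direct and modelled point estimates can be tabulated directly from Table 3.2. The sketch below is a simple illustration of that comparison; the function name is hypothetical and the absolute difference is only one of several possible discrepancy measures.

```python
# (region, WAACHS direct estimate, estimate based on NHS(I)), in per cent,
# transcribed from Table 3.2 (self-harm).
TABLE_3_2 = [
    ("Broome", 26.4, 16.5),
    ("Derby", 15.3, 7.3),
    ("Geraldton", 8.6, 7.2),
    ("Kalgoorlie", 6.3, 3.1),
    ("Kununurra", 5.8, 6.2),
    ("Narrogin", 9.9, 7.2),
    ("Perth", 12.8, 16.6),
    ("South Hedland", 8.2, 6.5),
    ("Warburton", 3.1, 8.0),
]


def rank_by_discrepancy(rows):
    """Regions ordered from largest to smallest absolute gap between the two estimates."""
    gaps = [(abs(direct - modelled), region) for region, direct, modelled in rows]
    return [region for gap, region in sorted(gaps, reverse=True)]


ordered = rank_by_discrepancy(TABLE_3_2)
worst_two = ordered[:2]
```

This looks at point estimates only; the width of the confidence intervals around the direct estimates matters when judging whether a gap is meaningful.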


3.3 Direct estimates from WAACHS and person model estimates from NHS(I) for tropical ear


                             WAACHS                      Estimate
                             direct           Posterior     based Posterior
ATSIC Region     Lower CI  estimate  Upper CI  Lower CI on NHS(I)  Upper CI
Broome              18.6%     24.1%     60.5%     21.7%     25.8%     30.0%
Derby               23.6%     28.0%     32.9%     22.4%     27.6%     33.1%
Geraldton           19.2%     23.6%     28.7%     18.4%     19.9%     21.3%
Kalgoorlie          16.0%     21.5%     28.0%     18.8%     25.6%     33.9%
Kununurra           20.0%     24.6%     29.7%     23.0%     28.3%     33.9%
Narrogin            15.0%     18.0%     21.3%     18.7%     20.6%     22.6%
Perth               16.0%     18.4%     20.9%     16.5%     18.1%     19.7%
South Hedland       19.6%     25.7%     32.8%     18.6%     21.6%     25.0%
Warburton           23.5%     29.9%     36.6%     22.2%     26.8%     31.6%


Table 3.3 compares the same estimates for tropical ear. The explanatory variables in
the tropical ear model were the child’s age, sex of the child, ARIA++ and a dummy
variable indicating whether more than 20% of the population in the ATSIC region is
Indigenous. As with the other two variables, this relatively simple model appeared to be
producing reasonable results for the ATSIC regions in Western Australia.


Overall these person level models appeared to be capable of producing viable results 
for the majority of ATSIC regions in Western Australia. The next section discusses the 
inclusion of random effects to improve the models. 


3.3 Random effects 


The inclusion of random effects could potentially improve the estimates at the ATSIC 
region level. In principle a random effect model appeared suitable for this feasibility 
study. However, the problem is how to estimate a random effect where there
is no sample. In this case we were applying the relationships from Western Australia
to Queensland and the Northern Territory, and the use of random effects at the ATSIC
region level was not possible. For example, there is no Warburton ATSIC region (or
comparable ATSIC region) in Queensland or the Northern Territory.


The ATSIC region was not the only area classification which could possibly be used. 
LORI (Level of Relative Isolation) is based on road distance to the nearest service 
centre and contains five categories of areas (see Appendix B for more information on 
LORI). The LORI categories group together areas which are far more homogeneous
than the ATSIC regions. For example, the Perth and Brisbane ATSIC regions are both
classified in the same LORI category. So, the random effects could be included in
the model for each LORI category, with the results still presented at the ATSIC
region level.


In general, the random effects will be included as follows:


$$\mathrm{logit}(p_{ij}) = \log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = X_{ij}\beta + v_j$$


The random effect parameters $v_j$ (for each LORI category) enter the model in a linear
form. It would be ideal to include the random effect when running the PWIGLS
model. However, the inclusion of random effects with PWIGLS required additional
time and resources to accomplish. We estimated the random effects in WINBUGS.


The Hausman test was used to indicate whether random effects are adding
explanatory power to the model. The Hausman test tests the hypothesis that


$$E(v_j \mid X) = 0$$


If this null hypothesis is true, then the random effects estimator is consistent and
efficient. If the null hypothesis is rejected, the random effects estimator is not
consistent and the fixed effects model is more appropriate.
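For reference, the textbook form of the Hausman statistic is given below. It is stated here as general background, on the assumption that a standard version of the test was applied; the paper does not give the exact form computed. The symbols $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ (fixed and random effects estimates of the common coefficients) and $k$ (the number of coefficients compared) are introduced for this illustration only.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^{\top}
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\sim\; \chi^{2}_{k} \quad \text{under the null hypothesis.}
```

A large value of $H$ relative to the $\chi^{2}_{k}$ distribution indicates that the two sets of estimates differ systematically, which is evidence against the null hypothesis.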


The random effect estimator was not consistent in the birth-weight model. Therefore, 
the results are not presented here. This was not surprising given that the probability 
of a child being born with low birth-weight should be more related to the mother’s 
individual characteristics rather than the characteristics of the area in which she lived. 
Of course, there may still be some correlation between the area and characteristics of 
the mother. For example, a mother living in a very remote area may not have easy 
access to fresh food and medical expertise which could have an impact on the 
birth-weight of a child. Or conversely living in a very remote area may help protect
the mother (for example, if a traditional lifestyle is followed) from poor diet or drug
and alcohol abuse.


Table 3.4 compares the random effect model estimates with the direct WAACHS 
estimates for self-harm. With the random effects included, Broome, Kununurra and
Warburton all produced estimates which were closer to the direct estimates from the 
WAACHS. South Hedland was the only ATSIC region which was (considerably) 
adversely affected by the inclusion of random effects. We believe this is due to almost 
all of South Hedland being in the ‘High’ LORI category. A large proportion of the 
Broome ATSIC region is also in the ‘High’ LORI category. The Broome ATSIC region 
has the highest level of self-harm while South Hedland is lowest. This may be the 
source of problems for these ATSIC regions’ estimates. 


3.4 Direct estimates from WAACHS and random effect model estimates from NHS(I) for
self-harm


                             WAACHS                      Estimate
                             direct           Posterior     based Posterior
ATSIC Region     Lower CI  estimate  Upper CI  Lower CI on NHS(I)  Upper CI
Broome              16.0%     26.4%     37.6%     13.3%     17.1%     21.4%
Derby                6.2%     15.3%     32.0%      4.3%      6.1%      8.4%
Geraldton            4.3%      8.6%     14.5%      5.9%      7.4%      9.1%
Kalgoorlie           3.7%      6.3%     10.5%      2.0%      2.8%      3.8%
Kununurra            3.6%      5.8%      8.6%      3.4%      5.2%      7.6%
Narrogin             6.4%      9.9%     14.4%      5.6%      7.0%      8.6%
Perth               10.3%     12.8%     15.9%     13.6%     16.1%     18.6%
South Hedland        4.4%      8.2%     14.2%      5.7%      8.3%     11.1%
Warburton            0.0%      3.1%      9.5%      5.0%      6.9%      9.1%
Table 3.5 compares the random effect model estimates with the direct WAACHS
estimates for tropical ear. The estimates from the random effect model for tropical
ear were quite similar to the results from the non-random effect model. However, the
confidence intervals are wider for the random effect model. The random effect
model will capture more of the variation which exists between the different LORI
categories.


3.5 Direct estimates from WAACHS and random effect model estimates from NHS(I) for tropical ear


                             WAACHS                      Estimate
                             direct           Posterior     based Posterior
ATSIC Region     Lower CI  estimate  Upper CI  Lower CI on NHS(I)  Upper CI
Broome              18.6%     24.1%     60.5%     18.5%
Derby               23.6%     28.0%     32.9%     15.6%
Geraldton           19.2%     23.6%     28.7%     17.9%
Kalgoorlie          16.0%     21.5%     28.0%     14.9%
Kununurra           20.0%     24.6%     29.7%     16.5%
Narrogin            15.0%     18.0%     21.3%     17.9%
Perth               16.0%     18.4%     20.9%     16.6%
South Hedland       19.6%     25.7%     32.8%     18.3%
Warburton           23.5%     29.9%     36.6%     17.0%


3.4 Estimates for Queensland and the Northern Territory 


The models appeared to be producing reasonable results for Western Australia. 
However, we still needed to test the assumption that the models would produce 
reliable results in Queensland and the Northern Territory. To do this we needed data 
from an independent source that could be used to help validate modelled estimates. 


There are data available for birth-weight from the Australian Institute of Health and
Welfare (AIHW). Although the scope⁴ and timing⁵ of the collections are different to
the WAACHS, the AIHW data do provide some indication of how reliable the estimates
might be for the two other jurisdictions.


The model estimates for Queensland and the AIHW data were quite close for most 
ATSIC regions. However, there were some discrepancies for the Torres Strait Islands. 


For the Northern Territory, the model estimates were also promising. 


4 AIHW data relates to children born to Indigenous mothers; the WAACHS relates to Indigenous children.


5 AIHW data is for 1991-1996, while the WAACHS data is for 1984-2002. 
4. CONCLUDING REMARKS


This paper describes some of the key methodological issues when trying to produce 
synthetic estimates. A number of different modelling techniques were explored to 
assess their feasibility for synthetic estimation and these appeared to show promising 
results. The results in this paper represent work in progress at June 2005. At that 
stage the project team were seeking advice from MAC members on the broad 
approach taken. 


Comments at the MAC meeting suggested that the key methodological assumptions 
needed further validation. Subsequent work could neither prove nor disprove the 
validity of these assumptions, and the synthetic estimation was deemed infeasible 
given the current national data sources available. This extension was a substantial 
piece of work in its own right, and is the subject of a report to be released in early 
2007. A brief summary of this work is provided in Appendix D. 


ACKNOWLEDGEMENTS 


The authors would like to thank Daniel Elazar, Lewis Conn, Francis Mitrou, David 
Lawrence, John de Maio, Robert Tanton, Marion McEwin, Alanna Sutcliffe and 
Jonathon Khoo for their helpful comments and assistance with this research project. 


The content and presentation of the paper are much improved as a result of their 
input. Responsibility for any errors or omissions remains solely with the authors. 


ABS • SYNTHESISING ESTIMATES OF INDIGENOUS CHILD HEALTH BASED ON THE WAACHS • 1352.0.55.071


ABS METHODOLOGY ADVISORY COMMITTEE • JUNE 2005


BIBLIOGRAPHY 
An, A., Watts, D. & Stokes, M. (1999) “SAS Procedures for Analysis of Sample Survey 
Data”, Survey Statistician, December 1999, pp. 10-13. 


Australian Bureau of Statistics (2001) National Health Survey: Aboriginal and Torres 
Strait Islander Results, Australia, cat. no. 4715.0.0, ABS, Canberra. 


Cameron, A. and Trivedi, P. (1998) Regression Analysis of Count Data, Press 
Syndicate of the University of Cambridge. 


Chambers, R. and Skinner, C. (eds) (2003) Analysis of Survey Data, Wiley, New York.
Congdon, P. (2001) Bayesian Statistical Modelling, John Wiley & Sons, Ltd. 
Congdon, P. (2003) Applied Bayesian Modelling, John Wiley & Sons, Ltd. 


Gelman, A., Carlin, J., Stern, H. and Rubin, D. (1995) Bayesian Data Analysis, 
Chapman & Hall. 


Greene, W. (2003) Econometric Analysis, 5th Edition, Prentice Hall, Sydney.
Gujarati, D. (1996) Basic Econometrics, 3rd Edition, McGraw Hill, Singapore.


Hosmer, D. and Lemeshow, S. (2000) Applied Logistic Regression, 2nd Edition, Wiley, 
New York. 


Maddala, G. (1992) Introduction to Econometrics, 2nd Edition, Prentice Hall, 
London. 


Pfeffermann, D., Skinner, C.J., Holmes, D.J., Goldstein, H. and Rasbash, J. (1998) 
“Weighting for Unequal Selection Probabilities in Multilevel Models” (with 
Discussion), Journal of the Royal Statistical Society, Series B: Statistical 
Methodology, 60, pp. 23-40. 


Rao, J. (2003) Small Area Estimation, John Wiley & Sons, Inc.


Skinner, C., Holt, D., and Smith, T. (eds) (1989) Analysis of Complex Surveys, Wiley, 
London. 


Snijders, T. and Bosker, R. (1999) Multilevel Analysis, SAGE Publications Ltd. 


Spiegelhalter, D.J., Thomas, A., Best, N.G. and Gilks, W.R. (1999) BUGS: Bayesian 
Inference Using Gibbs Sampling, Version 6.0, Cambridge Medical Research 
Council Biostatistics Unit. 


Wooldridge, J. (2000) Introductory Econometrics: A Modern Approach, 
South-Western College Publishing, Thomson Learning. 


Zubrick, S., Lawrence, D., de Maio, J. and Biddle, N. (2004a) Reliability of the 
Strengths and Difficulties Questionnaire: An analysis based on the Western 
Australian Aboriginal Child Health Survey, Telethon Institute of Child Health 
Research, Perth. 


Zubrick, S., Lawrence, D., Silburn, S., Mitrou, F., Blair, E., Milroy, H., Wilkes, T., Eades, 
S., Read, H., Ishiguchi, P. and Doyle, S. (2004b) The Western Australian 
Aboriginal Child Health Survey: The Health of Aboriginal Children and Young 
People, Volume 1, Telethon Institute of Child Health Research, Perth. 


APPENDIXES 


A. ISSUES RELATED TO THE AUXILIARY DATA 


The preliminary modelling undertaken appeared to be producing encouraging results. 
However, the reliability of any modelling is heavily dependent on the quality of the 
input data. The 2001 NHS(I) collects a number of potentially useful explanatory 
variables that are common to the WAACHS, but the NHS(I) has quite a small sample 
size. The NHS(I) was designed to produce reliable estimates at the national level for 
persons in scope of the survey. The sample design was based on a broad dissection of 
Australia into non-sparsely settled and sparsely settled areas. Also, not all Indigenous 
children in a household were sampled in the NHS(I). For example, households 
selected in non-sparsely settled areas only had one adult (aged 18 years and over) and 
up to two children aged 0-17 years selected. In households selected in sparsely 
settled areas, one adult (aged 18 years and over) and only one child aged 0-17 years 
were selected. 


Table A.1 compares the sample sizes from the WAACHS and the NHS(I) for each 
ATSIC region. Given the large sample of the WAACHS, we would not expect the same 
quality from similar breakdowns of the model estimates based on the NHS(I). For 
example, in most of the results presented in this paper, differences could be due to 
the small sample size of some ATSIC regions. Improved results may be possible with 
larger sample sizes within these regions. 


A.1 Sample sizes from the WAACHS and NHS(I) for WA 


ATSIC Region     WAACHS   NHS(I) 
Broome              213       51 
Derby               290       36 
Geraldton           624       36 
Kalgoorlie          325       16 
Kununurra           370       12 
Narrogin          1,020       32 
Perth             1,748       64 
South Hedland       369       48 
Warburton           330       25 
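
As a rough illustration of what these sample sizes imply, the sketch below compares the two surveys region by region. It assumes simple random sampling, under which the standard error of an estimated proportion scales with one over the square root of the sample size. Both surveys in fact use complex designs, so the ratios are indicative only. The names SAMPLE_SIZES and se_inflation are illustrative and are not part of the project.

```python
import math

# Sample sizes from Table A.1: ATSIC region -> (WAACHS, NHS(I))
SAMPLE_SIZES = {
    "Broome": (213, 51),
    "Derby": (290, 36),
    "Geraldton": (624, 36),
    "Kalgoorlie": (325, 16),
    "Kununurra": (370, 12),
    "Narrogin": (1020, 32),
    "Perth": (1748, 64),
    "South Hedland": (369, 48),
    "Warburton": (330, 25),
}


def se_inflation(n_large, n_small):
    """Approximate ratio of standard errors (small sample over large sample).

    Assumes simple random sampling, so SE is proportional to 1 / sqrt(n).
    """
    return math.sqrt(n_large / n_small)


for region, (waachs, nhsi) in SAMPLE_SIZES.items():
    print(f"{region:<14} {se_inflation(waachs, nhsi):4.1f}")
```

Under this simplifying assumption, a regional estimate based on the NHS(I) would have a standard error several times larger than the corresponding WAACHS estimate.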


B. DATA USED BY THE PROJECT 


Western Australian Aboriginal Child Health Survey (WAACHS) 


The WAACHS was a large-scale survey conducted by the Telethon Institute of Child 
Health Research (TICHR). It measured the health and wellbeing of over 5,000 Western 
Australian Aboriginal and Torres Strait Islander children and young people, and was 
run for the first time from May 2000 to June 2002. The primary objective of the survey 
was to identify the developmental and environmental factors affecting Indigenous 
children and young people. In this survey, Indigenous children and young people 
include all those aged 17 years and under who are identified by their carers as either 
Aboriginal or Torres Strait Islander. 


The data was collected directly from children and young people, their carers and their 
teachers, and then linked to administrative records (health and education) associated 
with the child. 


The WAACHS includes information on children’s health, wellbeing and education as 
well as demographic and social characteristics of their household and family. 
Information collected in the survey includes dental health, hearing, birth-weight, diet, 
admission to hospital, language spoken, violence in the family and mothers’ alcohol 
and drug use while pregnant. Separate data on the health and wellbeing of a child was 
also collected directly from those children aged 12 to 17 years. For further 
information, please refer to The Health of Aboriginal Children and Young People, 
Volume 1, Telethon Institute of Child Health Research. 


2001 National Health Survey Indigenous (NHS(I)) 


The NHS(I) obtained detailed information on the health status of the Indigenous 
population and was designed to produce reliable estimates at the national level. The 
survey included Indigenous people of all ages and was conducted from February to 
November 2001 and previously in 1995. The NHS(I) collected data from 3,198 
Aboriginal and Torres Strait Islanders (1,853 adults aged 18 years and over, and 1,828 
children aged under 18). 


The survey was designed to collect information on a wide range of health issues of 
Indigenous Australians. The NHS(I) collected information on self assessed health 
status, use of health services, health related lifestyle aspects and demographic 
characteristics. For further information, please refer to National Health Survey: 
Aboriginal and Torres Strait Islander Results, Australia, ABS cat. no. 4715.0. 


Census of Population and Housing 


The Australian Census of Population and Housing is conducted every five years. The 
2001 Census data used in this project was included to support the analysis at levels the 
sample surveys could not. Data is collected on a wide variety of topics, including 
demographic information and geographic detail. Estimates from the Census contain 
no sampling error as the total population is enumerated. 


National Aboriginal and Torres Strait Islander Social Survey (NATSISS) 


The NATSISS is a cross-cutting social survey of Australia's Indigenous population. It 
was first conducted in 1994. The survey was designed to enable analysis on the 
interrelationships of social circumstances and outcomes of Aboriginal and Torres 
Strait Islander Australians. The survey was conducted from August 2002 to April 2003, 
collecting information from 9,400 Indigenous Australians aged 15 years and over. 
Some basic information was also collected about the number of children under 15 
living in the same household as the survey respondent. 


Information collected from the NATSISS covered topics including family and 
community, culture and language, health, income and housing, education, 
employment, law and justice, information technology and transport. For further 
information, please refer to National Aboriginal and Torres Strait Islander Social 
Survey, ABS cat. no. 4714.0. 


Community Housing and Infrastructure Needs Survey (CHINS) 


The CHINS was designed to assist in the evaluation of policies and programs designed 
to improve housing and infrastructure services for Aboriginal and Torres Strait 
Islander peoples living in both discrete communities and in other housing managed 
by Indigenous organisations. This study used the CHINS survey that was conducted in 
conjunction with the 2001 Census. 


The 2001 CHINS collects information on: 


° The current housing stock, dwelling management practices and selected income 
and expenditure arrangements of Indigenous organisations that provide housing 
to Aboriginal and Torres Strait Islander peoples; and 


° Details of housing and related infrastructure such as water, electricity, sewerage 
systems, drainage, and rubbish collection and disposal, as well as other facilities 
such as transport, communication, education, sport and health services, 
available in discrete Aboriginal and Torres Strait Islander communities. 


ARIA++ LORI 


ARIA was developed as an index based on the road distance people must travel from a 
given location to service centres of various sizes. A service centre is an area where 
people can access goods, services and opportunities for social interaction. The 
population size of the service centre is used as a proxy for the availability of a range of 
services. The road distance is used as a proxy for the degree of remoteness from 
those services. ARIA is based solely on physical geography and is not, by itself, 
intended to measure other factors such as social isolation, wellbeing or other 
socioeconomic factors. 


The calculation of an ARIA score involves measuring the shortest road distance 
between a populated locality and categories of service centres. A special adjustment is 
made for islands when calculating ARIA scores. Service centres are categorised based 
on the range and type of goods and services that are available. LORI is a set of 
categories of the ARIA++ scores which allows meaningful analysis of Indigenous 
health to be undertaken. 


C. RSES AND THE ROOT RELATIVE POSTERIOR VARIANCES 


C.1 Birth-weight estimates for Western Australia 


Broome          24.1% 
Derby           28.0% 
Geraldton       23.6% 
Kalgoorlie      21.5% 
Narrogin        18.0% 
Perth           18.4% 
South Hedland   25.7% 
Warburton       29.9% 


D. PROGRESS SINCE THE JUNE 2005 MAC MEETING 


A number of issues were raised at the MAC meeting, in particular the validity of the 
assumption that there is no jurisdiction effect so that models developed in Western 
Australia can reasonably be applied to Queensland and the Northern Territory. It was 
also suggested that we should investigate the use of multilevel modelling. It was 
agreed that the key assumptions need to be clearly articulated along with the criteria 
for assessing whether the assumptions hold or not. There was general support for 
investigating synthetic methods such as these so long as the quality associated with 
the methods was understood and on the basis that the key assumptions hold. 


Following this meeting, the project team focused on testing the following three key 
assumptions: 


° models with reasonable explanatory power can be developed from the WAACHS 
for Western Australian ATSIC regions. 


° the predictor variables drawn from the WAACHS are available for Queensland 
and the Northern Territory in a comparable form from a national data set, i.e. 
they measure the same concept and were collected using similar questions. 


° relationships identified in the WAACHS data for Western Australia will be similar 
to those in Queensland and the Northern Territory. 


The validity and quality of all modelled estimates depend critically on these 
assumptions and this needs to be well understood by users. To test the third 
assumption the project team modelled the results by interacting flags for each 
jurisdiction with each of the explanatory variables. The joint significance of the 
coefficients of the interaction terms could then be statistically tested. A number of 
models were developed but most of the variables in the models were not significant 
and hence we were unable to test whether the strength and the direction of the 
relationships are consistent across jurisdictions. 
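
The idea behind such a test can be sketched for the simplest possible case. The example below assumes a logistic model with a single binary predictor and treats the intercept as one of the interacted terms, so the pooled model is compared with a model fitted separately in each jurisdiction using a likelihood ratio test. This is an illustrative variant and not necessarily the specification used by the project team. The cell counts are hypothetical, and the function names are introduced here for illustration only.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability (closed form, valid for even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = max(x, 0.0) / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def binom_loglik(successes, trials):
    """Binomial log-likelihood of a cell evaluated at its own proportion."""
    if successes == 0 or successes == trials:
        return 0.0
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def jurisdiction_lr_test(cells):
    """cells maps (jurisdiction, x) -> (successes, trials) for a binary predictor x.

    With one binary predictor, the fitted probabilities of a logistic model
    equal the cell proportions, so no iterative fitting is required.
    """
    # Unrestricted model: separate intercept and slope in every jurisdiction.
    ll_full = sum(binom_loglik(s, n) for s, n in cells.values())
    # Restricted model: no jurisdiction effect, so pool counts within each x.
    pooled = {}
    for (_, x), (s, n) in cells.items():
        ps, pn = pooled.get(x, (0, 0))
        pooled[x] = (ps + s, pn + n)
    ll_pooled = sum(binom_loglik(s, n) for s, n in pooled.values())
    stat = 2.0 * (ll_full - ll_pooled)
    n_jurisdictions = len({j for j, _ in cells})
    df = (n_jurisdictions - 1) * len(pooled)  # even whenever x is binary
    return stat, df, chi2_sf_even(stat, df)


# Hypothetical counts, for illustration only: (jurisdiction, x) -> (cases, children)
example = {
    ("WA", 0): (30, 200), ("WA", 1): (60, 200),
    ("QLD", 0): (28, 180), ("QLD", 1): (50, 170),
    ("NT", 0): (20, 120), ("NT", 1): (33, 110),
}
stat, df, p_value = jurisdiction_lr_test(example)
print(f"LR statistic = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```

A large p-value here means only that a jurisdiction effect was not detected. With small samples or weak predictors the test has little power, which is consistent with the difficulty described above.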


In response to the comments from the MAC members, the project team fully 
articulated the diagnostics that will be used to demonstrate the strength of the model 
fit. These include: 


° goodness of fit statistics (such as the Hosmer and Lemeshow test) 


° statistical significance of the estimated coefficients for each predictor variable in 
the model 


° predictive power of the model in terms of the mean square error, tests against 
direct estimates for bias and additivity to higher level aggregates 


° plausibility of the predictor variables in terms of direction and magnitude of their 
coefficients 


° plausibility of the predicted indicators in terms of how well their spread across 
regions is consistent with local knowledge. 
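
As an illustration of the first diagnostic, the sketch below computes the Hosmer and Lemeshow statistic from observed binary outcomes and model-predicted probabilities. It is a minimal version under stated assumptions: observations are split into equal-sized groups by predicted probability, the statistic is referred to a chi-square distribution with (groups - 2) degrees of freedom, and the number of groups must be even so that a closed-form tail probability can be used. The function names and the synthetic data are illustrative and do not come from the project.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability (closed form, valid for even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = max(x, 0.0) / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, p-value) for outcomes y (0/1) and predicted probabilities p."""
    if len(y) != len(p) or len(y) < groups:
        raise ValueError("need matching inputs with at least one observation per group")
    pairs = sorted(zip(p, y))  # order observations by predicted probability
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)   # observed number of events
        expected = sum(pi for pi, _ in chunk)   # expected number of events
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat, chi2_sf_even(stat, groups - 2)


# Synthetic data, for illustration only
probs = [(i + 0.5) / 100 for i in range(100)]
outcomes = [1 if (i * 7919) % 100 < 100 * probs[i] else 0 for i in range(100)]
hl_stat, hl_p = hosmer_lemeshow(outcomes, probs, groups=10)
print(f"Hosmer-Lemeshow statistic = {hl_stat:.3f}, p-value = {hl_p:.3f}")
```

A small p-value suggests that the predicted probabilities do not match the observed event rates within groups, that is, the model is poorly calibrated.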


The following quality criteria were applied to each of the test variables: 

° Can a valid model be fitted? 

° Are comparable predictor variables available in all jurisdictions? 

° Are model estimates affected by jurisdiction? 

° Are predictions from the model plausible? 


A series of models, including random effects and multilevel models, were developed 
and assessed against these criteria. After assessing the variables in the models, the 
2001 Census was used as the primary data source, supplemented with contextual 
variables at the Statistical Subdivision level, for example the percentage of Indigenous 
people who smoke. This increased the amount of data available for modelling and 
improved the accuracy of the synthetic estimates. 


A status report and detailed technical paper will be available in early 2007. Based on 
our investigations, we concluded that the evidence is insufficient to prove or disprove 
a jurisdiction effect. Without conclusive evidence, we are unable to state that a key 
assumption underpinning these synthetic estimates holds. Hence, we have 
concluded that, given the current national datasets and variables available, it is not 
feasible to derive synthetic estimates for Queensland and the Northern Territory 
based on the WAACHS. 